Architecture for Hadoop Distributed File Systems

نویسنده

  • S. Devi
چکیده

The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow with demand while remaining economical at every size. In this paper focused on the backend architecture and working of the parts of the hadoop framework which are the map reduce for the Computational and analytics section and the Hadoop distributed file system (HDFS) for the storage section and how the files are shared in the distributed environment.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Analysis of MapReduce Scheduling Algorithms for Hadoop

Today’s Digital era causes escalation of datasets. These datasets are termed as “Big Data” due to its massive amount of volume, variety and velocity and is stored in distributed file system architecture. Hadoop is framework that supports Hadoop Distributed File System (HDFS)for storing and MapReduce for processing of large data sets in a distributed computing environment. Task assignment is pos...

متن کامل

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

Hadoop MapReduce framework is an important distributed processing model for large-scale data intensive applications. The current Hadoop and the existing Hadoop distributed file system’s rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and a same workload is assigned to each node. Default Hadoop d...

متن کامل

Hadoop Scalability and Performance Testing in Heterogeneous Clusters

This paper aims to evaluate cluster configurations using Hadoop in order to check parallelization performance and scalability in information retrieval. This evaluation will establish the necessary capabilities that should be taken into account specifically on a Distributed File System (HDFS: Hadoop Distributed File System), from the perspective of storage and indexing techniques, and queriy dis...

متن کامل

Investigation of Distributed Search Engine Based on Hadoop

This paper begins with a review on the research status of search engine, followed by discussion on goals of search engine, and then the principle of distributed computing is explained. Consequently the MapReduce distributed computing model and the Hadoop distributed file system (HDFS) are analyzed in detail. Finally the distributed search engine architecture is presented. On the basis of the ar...

متن کامل

A REVIEW: Distributed File System

Data collection in the world are growing and expanding. This capability, it is important that the infrastructure must be able to store a huge collection of data on their number grows every day. The conventional methods are used in data centers for capacity building, costs of software, hardware and management of this process will be very high, today. File system architecture is that, independent...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014